FUNCTIONAL CRAMÉR-RAO BOUNDS AND STEIN ESTIMATORS IN
SOBOLEV SPACES, FOR BROWNIAN MOTION AND COX PROCESSES

ENI MUSTA, MAURIZIO PRATELLI, AND DARIO TREVISAN 


Abstract. We investigate the problems of drift estimation for a shifted Brownian motion 
and intensity estimation for a Cox process on a finite interval [0,T], when the risk is given 
by the energy functional associated to some fractional Sobolev space $H^1_0 \subset W^{\alpha,2} \subset L^2$.

In both situations, Cramér-Rao lower bounds are obtained, entailing in particular that no
unbiased estimators with finite risk in $H^1_0$ exist. By Malliavin calculus techniques, we also
study super-efficient Stein type estimators (in the Gaussian case).


1. Introduction 

In this paper we focus on two problems of non-parametric (or, more rigorously,
infinite-dimensional parametric) statistical estimation: drift estimation for a shifted Brownian motion
and intensity estimation for a Cox process, on a finite time interval [0,T]. Our investigation
stems from the articles [PR08; PR09], where N. Privault and A. Réveillac developed an original
approach to these problems, by employing techniques from Malliavin calculus and the so-called
Stein's method [JS61] to study Cramér-Rao bounds and super-efficient “shrinkage” estimators
in these infinite-dimensional frameworks. Such a combination of these two powerful techniques
fits into a more general picture, which has become clear only in recent years (see the
monograph [NP12]) and is currently a very active research area, with impact on statistics (see
e.g. [Gob01; CKH11; PR11; Liu13]) and, more generally, on probabilistic approximations.

As in [PR08; PR09], we assume that the unknown function to be estimated belongs to the
Hilbert space $H^1_0(0,T)$ (which is a reasonable choice, at least in the case of shifted Brownian
motion, because of the Cameron-Martin and Girsanov theorems) but we move further by addressing
the following question, which is rather natural but apparently was not considered: what
about estimators which also take values in $H^1_0$? Indeed, in [PR08; PR09], estimators are seen
as functions with values in $L^2([0,T],\mu)$ (where $\mu$ is any finite measure) or, equivalently, the
associated risk is computed with respect to the $L^2(\mu)$ norm and not the (stronger) $H^1_0$ norm.

To investigate this problem, we first provide Cramér-Rao bounds with respect to different
risks, by considering the estimation in the interpolating fractional Sobolev spaces
$H^1_0 \subset W^{\alpha,2} \subset L^2$, for $\alpha \in [0,1]$. It turns out that no unbiased estimator exists in $H^1_0$ (Theorem 2.5),
nor even in $W^{\alpha,2}$, for $\alpha \ge 1/2$ (Theorem 2.9). Although a bit surprising, these results agree
with the following intuition: since the estimator is a function of the realization of the process,
whose paths also do not belong to $H^1_0$ (nor to $W^{\alpha,2}$, for $\alpha \ge 1/2$), it is “too risky” to estimate
(without bias) the parameter in that scale of regularity. Therefore, besides answering a rather
natural question, our results highlight the delicate role played by the choice of different norms
in such estimation problems, and one might expect similar phenomena to appear in
other, technically more demanding, situations (e.g. SDEs).

The second and third authors are members of the GNAMPA group of the Istituto Nazionale di Alta 
Matematica (INdAM). 
As a second task, we study super-efficient “shrinkage” estimators in the spaces $W^{\alpha,2}$. It is
often intuitively suggested that the ideal situation for the problem of estimation would be to have
an unbiased estimator with low variance, but allowing for a little bias may entail existence of
estimators with lower risks, in many situations: this is the purpose of Stein's method, and we
rely on its extension and combination with Malliavin calculus to these frameworks developed
in [PR08; PR09]. With a similar approach, we give sufficient conditions for super-efficient
estimators in $W^{\alpha,2}$, for $\alpha < 1/2$, and we give explicit examples of such estimators, in the
case of Brownian motion (Example 5.1). In the case of Cox processes, although it is possible
to define a suitable version of Malliavin calculus and provide as well sufficient conditions for
Stein estimators, we are currently unable to provide explicit examples.

The paper is organized as follows. In Section 2 we deal with drift estimation for a shifted 
Brownian motion, addressing Cramér-Rao lower bounds with respect to risks computed in $H^1_0$
and fractional Sobolev spaces. Analogous results on intensity estimators for Cox processes
are given in Section 3. In Section 4, we recall notation and results for Malliavin calculus on 
the Wiener space. Finally, in Section 5, we discuss super-efficient estimators. 

2. Drift estimation for a shifted Brownian motion 

In this section, we fix T > 0 and let $X = (X_t)_{t\in[0,T]}$ be a Brownian motion (on the finite
interval [0,T]), defined on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in[0,T]}, \mathbb{P})$. As an
(infinite-dimensional) space of parameters $\Theta$, we consider a set of absolutely continuous, adapted
processes $u_t := \int_0^t \dot u_s\,ds$ (for $t \in [0,T]$) such that $(\dot u_t)_{t\in[0,T]}$ satisfies the conditions of Girsanov
theorem: indeed, for $u \in \Theta$, we define the probability $\mathbb{P}^u := L^u\,\mathbb{P}$, with

    $L^u := \exp\Big(\int_0^T \dot u_s\,dX_s - \frac{1}{2}\int_0^T \dot u_s^2\,ds\Big)$,

and Girsanov theorem entails that, with respect to the probability $\mathbb{P}^u$, the process
$X^u_t := X_t - u_t$ is a Brownian motion on [0,T].
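
The change of measure can be checked numerically in the simplest case. The following is a minimal sketch, assuming the usual Girsanov density $L^u = \exp(\int_0^T \dot u_s\,dX_s - \frac12\int_0^T \dot u_s^2\,ds)$ and a deterministic linear drift $u_t = ct$; the function name and the parameter values are illustrative and do not come from the paper. It estimates $\mathbb{E}[L^u]$ (which should be 1) and $\mathbb{E}[L^u X_T] = \mathbb{E}^u[X_T]$ (which should be $u_T = cT$).

```python
import math
import random


def girsanov_moments(c=0.5, T=1.0, n_samples=200_000, seed=0):
    """Monte Carlo estimates of E[L^u] and E[L^u X_T] for the linear drift u_t = c*t."""
    rng = random.Random(seed)
    sum_density = 0.0
    sum_density_x = 0.0
    for _ in range(n_samples):
        # Under P, X is a Brownian motion, so X_T ~ N(0, T).
        x_T = rng.gauss(0.0, math.sqrt(T))
        # Constant slope c: int_0^T c dX_s = c*X_T and (1/2) int_0^T c^2 ds = c^2*T/2.
        density = math.exp(c * x_T - 0.5 * c * c * T)
        sum_density += density
        sum_density_x += density * x_T
    return sum_density / n_samples, sum_density_x / n_samples


mean_density, mean_x_under_pu = girsanov_moments()
print(mean_density)     # close to 1: L^u is a probability density
print(mean_x_under_pu)  # close to c*T = 0.5: X has drift u under P^u
```

Only the terminal value $X_T$ is sampled, which is enough here because the density depends on the path only through $X_T$ when the slope is constant.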

We address the problem of estimating the drift w.r.t. P“ on the basis of a single observation 
of X. This is of interest in different fields of applications: for example, we can interpret X as 
the observed output signal of some unknown input signal u, perturbed by a Brownian noise. 
Such a problem is investigated e.g. in [PR08], where the following definition is given.

Definition 2.1. Any measurable stochastic process $\xi : \Omega \times [0,T] \to \mathbb{R}$ is called an estimator
of the drift u. An estimator of the drift u is said to be unbiased if, for every $u \in \Theta$, $t \in [0,T]$,
$\xi_t$ is $\mathbb{P}^u$-integrable and it holds $\mathbb{E}^u[\xi_t] = \mathbb{E}^u[u_t]$.

In this section, we omit the specification “of the drift u” and simply refer to estimators.
Moreover, we refer to the quantity $\mathbb{E}^u[\xi_t - u_t]$ as the bias of the estimator $\xi$ (whenever it is
well-defined).

By introducing as a risk associated to any estimator $\xi$ the quantity

(1)    $\mathbb{E}^u\big[\|\xi - u\|^2_{L^2(\mu)}\big] = \mathbb{E}^u\Big[\int_0^T |\xi_t - u_t|^2\,\mu(dt)\Big]$,

where $\mu$ is any finite Borel measure on [0,T], Privault and Réveillac provide the following
Cramér-Rao lower bound for adapted and unbiased estimators [PR08, Proposition 2.1], $\Theta$
being the space of all absolutely continuous, adapted processes, whose derivatives satisfy the
conditions of Girsanov theorem.
Theorem 2.2 (Cramér-Rao inequality in $L^2(\mu)$). For any adapted and unbiased estimator $\xi$
it holds

(2)    $\mathbb{E}^u\big[\|\xi - u\|^2_{L^2(\mu)}\big] \ge \int_0^T t\,\mu(dt)$, for every $u \in \Theta$.

Equality is attained by the (efficient) estimator $\xi = X$.
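
As an illustration of the equality case, the following minimal sketch simulates the observation $X = u + X^u$ under $\mathbb{P}^u$ and estimates the $L^2(\mu)$ risk of $\xi = X$ by Monte Carlo. The choices here are assumptions made only for the example: $\mu$ is the Lebesgue measure on [0,1], the drift is the deterministic function $u_t = \sin t$, and time is discretised on a uniform grid. The bound $\int_0^T t\,dt = T^2/2$ should be recovered up to sampling and discretisation error.

```python
import math
import random


def l2_risk_of_observation(T=1.0, n_steps=100, n_paths=2000, seed=0):
    """Monte Carlo estimate of E^u[ int_0^T |X_t - u_t|^2 dt ] for the estimator xi = X."""
    rng = random.Random(seed)
    drift = math.sin  # illustrative deterministic drift u_t = sin(t), with u_0 = 0
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        noise = 0.0  # X^u_t = X_t - u_t, a Brownian motion under P^u
        path_risk = 0.0
        for k in range(1, n_steps + 1):
            t = k * dt
            noise += rng.gauss(0.0, math.sqrt(dt))
            observed = drift(t) + noise       # X_t = u_t + X^u_t
            error = observed - drift(t)       # estimation error of xi = X at time t
            path_risk += error * error * dt   # Riemann sum for the L^2(dt) norm
        total += path_risk
    return total / n_paths


risk = l2_risk_of_observation()
print(risk, "vs. Cramer-Rao bound", 0.5)
```

The drift cancels in the error, which is why the risk of $\xi = X$ does not depend on u in the Brownian case.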

Before giving our results, let us briefly comment on some aspects of this inequality and its
proof, in particular with respect to adaptedness of $\xi$ and the role played by the exponent 2.

By direct inspection of the proof in [PR08], the requirement for $\xi$ to be adapted is seen to
be unnecessary. Indeed, the argument relies on an application of Cauchy-Schwarz inequality
in the right hand side of the identity

(3)    $v(t) = \mathbb{E}^u\Big[(\xi_t - u_t)\int_0^T \dot v(s)\,dX^u_s\Big]$, for $t \in [0,T]$,

valid for every deterministic process $v \in \Theta$ (thus, $v(t) := \int_0^t \dot v(s)\,ds$) and then choosing
$\dot v(s) = 1_{[0,t]}(s)$. In turn, the proof of (3) uses the fact that, for every $\varepsilon \in \mathbb{R}$, it holds $u + \varepsilon v \in \Theta$,
thus

    $\mathbb{E}^{u+\varepsilon v}[\xi_t] = \mathbb{E}^{u+\varepsilon v}[u_t + \varepsilon v(t)] = \mathbb{E}^{u+\varepsilon v}[u_t] + \varepsilon v(t)$, for $t \in [0,T]$,

and differentiates with respect to $\varepsilon$ at $\varepsilon = 0$ (exchanging differentiation and expectation
is justified by the finiteness of the left hand side in (2), otherwise there is nothing to
prove):

    $v(t) = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\mathbb{E}^{u+\varepsilon v}[\xi_t - u_t] = \mathbb{E}^u\Big[(\xi_t - u_t)\,\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\log L^{u+\varepsilon v}\Big]$

    $\phantom{v(t)} = \mathbb{E}^u\Big[(\xi_t - u_t)\int_0^T \dot v(s)\,dX^u_s\Big]$.

Let us also notice that it is not necessary for $\Theta$ to be the whole set of drifts u such that
Girsanov theorem applies to u, and the following condition is sufficient: for every $u \in \Theta$ and
deterministic $v \in \Theta$, it holds $u + v \in \Theta$.

Remark 2.3. Back to the problem of adaptedness of $\xi$, it would be desirable to argue that
general (not necessarily adapted) estimators cannot perform better than adapted ones, and
the following argument might seem to go in that direction, but does not allow us to conclude.
Let $\xi$ be any unbiased estimator and, for $u \in \Theta$, consider the optional projection $\eta$ of $\xi$ with
respect to the probability $\mathbb{P}^u$, so that $\eta_t := \mathbb{E}^u[\xi_t \mid \mathcal{F}_t]$, for $t \in [0,T]$. Then, $\mathbb{E}^u[\eta_t] = \mathbb{E}^u[u_t]$ and
it holds

    $\mathbb{E}^u\big[|\eta_t - u_t|^2\big] = \mathbb{E}^u\big[\big|\mathbb{E}^u[\xi_t - u_t \mid \mathcal{F}_t]\big|^2\big] \le \mathbb{E}^u\big[|\xi_t - u_t|^2\big]$.

However, this does not entail that $\eta$ performs better than $\xi$, since $\eta = \eta^u$ depends also on
u, thus it is not an estimator. On the other side, if we keep $u \in \Theta$ fixed, then $\eta^u$ could be
biased, i.e. $\mathbb{E}^v[\eta^u_t] \ne \mathbb{E}^v[v_t]$ for some $v \in \Theta$, $t \in [0,T]$.

Remark 2.4. Similarly to the mean squared error, one can consider the risk defined by $L^p$
norms, for $p \in (1,\infty)$:

    $\mathbb{E}^u\Big[\int_0^T |\xi_t - u_t|^p\,\mu(dt)\Big]$.

Again, by direct inspection of the proof in [PR08], applying Hölder inequality (with conjugate
exponents (p,q)) instead of Cauchy-Schwarz inequality in (3), we obtain an inequality of the
form

    $\mathbb{E}^u\big[|\xi_t - u_t|^p\big] \ge c_q^{-p/q}\,t^{p/2}$, for $t \in [0,T]$,

where $c_q := \mathbb{E}[|Y|^q]$ is the q-th moment of a N(0,1) random variable Y. Integration with
respect to $\mu$ then provides a Cramér-Rao type lower bound. However, letting $\xi = X$, one has

    $\mathbb{E}^u\big[|X_t - u_t|^p\big] = c_p\,t^{p/2}$, for $t \in [0,T]$,

and $c_p > c_q^{-p/q}$ whenever $p \ne 2$ (by the strict Hölder inequality applied to $\mathbb{E}[|Y|\,|Y|] = 1$),
thus X is not an efficient estimator in $L^p(\Omega \times [0,T])$ for $p \ne 2$.
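
The gap between the risk of X and the Hölder-type lower bound can be quantified with the closed form $c_p = 2^{p/2}\,\Gamma((p+1)/2)/\sqrt{\pi}$ for the absolute moments of a standard Gaussian. The sketch below is only illustrative (the helper names are not from the paper); it computes the ratio between $c_p\,t^{p/2}$ and the bound $c_q^{-p/q}\,t^{p/2}$, which does not depend on t.

```python
import math


def gaussian_abs_moment(p):
    """E|Y|^p for Y ~ N(0,1), via the closed form 2^(p/2) * Gamma((p+1)/2) / sqrt(pi)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)


def efficiency_ratio(p):
    """Ratio of the L^p risk of xi = X to the lower bound: c_p * c_q^(p/q), q = p/(p-1)."""
    q = p / (p - 1.0)
    return gaussian_abs_moment(p) * gaussian_abs_moment(q) ** (p / q)


for p in (1.5, 2.0, 3.0, 4.0):
    print(p, efficiency_ratio(p))
```

A ratio equal to 1 means the bound is attained; this happens for p = 2 only, in agreement with the remark above.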

In all that follows, we let $H^1_0\ (= H^1_0(0,T))$ be the space of (continuous) functions of the
form $h(t) = \int_0^t \dot h(s)\,ds$, for $t \in [0,T]$, with $\dot h \in L^2(0,T)$ (usually called, in this context, the
Cameron-Martin space), and we assume that, for every $u \in \Theta$, $h \in H^1_0$, it holds $u + h \in \Theta$.
The $H^1_0$ “energy” functional, namely $\|h\|_{H^1_0} := \|\dot h\|_{L^2(0,T)}$, provides a Hilbert norm on $H^1_0$.
For simplicity of notation, we extend such a functional identically to $+\infty$ for any Borel curve
$h : [0,T] \to \mathbb{R}$ which does not belong to $H^1_0$.

We notice that $H^1_0$ is included in $C^{1/2}(0,T)$, the space of 1/2-Hölder continuous functions:
since the paths of the Brownian motion are not 1/2-Hölder continuous, we deduce that
the process X is not $H^1_0$-valued (negligibility of the Cameron-Martin space holds true also for
abstract, infinite-dimensional, Wiener spaces). However, since the drift u takes values in $H^1_0$,
it is natural to look for an estimator $\xi$ sharing this property. Our first result shows that, if we
require $\xi$ to be unbiased, this is not possible, i.e. such an estimator $\xi$ has necessarily infinite
$H^1_0$ risk.

Theorem 2.5 (Estimators in $H^1_0$). Let $\xi$ be an estimator such that, for some $u \in \Theta$, it holds

    $\mathbb{E}^u\big[\|\xi - u\|^2_{H^1_0}\big] < \infty$.

Then, $\xi$ is not unbiased.

Before we address the proof for general, possibly non-adapted, estimators, we give the
following argument that exploits Itô's formula: it is longer, but we feel that it has more
of a stochastic flavor.


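Proof. (Case of adapted estimators.) Let us assume, by contradiction, that $\xi$ is unbiased,
thus by difference, $\xi \in L^2(\Omega, \mathbb{P}^u; H^1_0)$. For every (deterministic) $v \in H^1_0$, arguing as above for
the deduction of (3), we obtain that

    $v(t) = \mathbb{E}^u\Big[\int_0^t (\dot\xi_s - \dot u_s)\,ds \int_0^t \dot v(s)\,dX^u_s\Big]$, for $t \in [0,T]$,

where stochastic integration reduces to the interval [0,t] because of the adaptedness
assumption. Integrating by parts (i.e., using Itô's formula) we rewrite the random variable above
as

    $\int_0^t \Big(\int_0^s \dot v(r)\,dX^u_r\Big)(\dot\xi_s - \dot u_s)\,ds + \int_0^t \Big(\int_0^s (\dot\xi_r - \dot u_r)\,dr\Big)\dot v(s)\,dX^u_s$,

obtaining (the second term being a stochastic integral with zero expectation) the right
analogue of (3) for the study of the $H^1_0$ energy:

    $v(t) = \mathbb{E}^u\Big[\int_0^t \Big(\int_0^s \dot v(r)\,dX^u_r\Big)(\dot\xi_s - \dot u_s)\,ds\Big]$, for $t \in [0,T]$.

Indeed, Cauchy-Schwarz inequality and Itô's isometry give

    $v(t)^2 \le \mathbb{E}^u\Big[\int_0^t \Big(\int_0^s \dot v(r)\,dX^u_r\Big)^2 ds\Big]\;\mathbb{E}^u\Big[\int_0^t (\dot\xi_s - \dot u_s)^2\,ds\Big]$

    $\phantom{v(t)^2} = \int_0^t \Big(\int_0^s \dot v^2(r)\,dr\Big)ds \int_0^t \mathbb{E}^u\big[(\dot\xi_s - \dot u_s)^2\big]\,ds$

    $\phantom{v(t)^2} = \int_0^t (t-s)\,\dot v^2(s)\,ds \int_0^t \mathbb{E}^u\big[(\dot\xi_s - \dot u_s)^2\big]\,ds$.

In particular, choosing t = T, we deduce

    $\mathbb{E}^u\big[\|\xi - u\|^2_{H^1_0}\big] \ge \frac{v(T)^2}{\int_0^T (T-t)\,\dot v^2(t)\,dt}$.

To obtain a contradiction, it is enough to prove that for every constant c > 0, there exists
$\dot v$ such that the right hand side above is greater than c, i.e.,

(4)    $\Big(\int_0^T \dot v(t)\,dt\Big)^2 > c \int_0^T (T-t)\,\dot v^2(t)\,dt$.

Indeed, if we let $\dot v(t) = (T-t)^{-a}$ for some 0 < a < 1, it holds

    $\int_0^T \dot v(t)\,dt = \frac{T^{1-a}}{1-a}$  and  $\int_0^T (T-t)\,\dot v^2(t)\,dt = \frac{T^{2(1-a)}}{2(1-a)}$.

It is then sufficient to let $a \uparrow 1$ to conclude.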


□ 
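
For the reader's convenience, the last step of the proof can be spelled out as a short worked computation. This is a sketch based on the choice $\dot v(t) = (T-t)^{-a}$, 0 < a < 1, used above; it shows that the ratio appearing in the lower bound equals $2/(1-a)$, independently of T.

```latex
\begin{align*}
\int_0^T \dot v(t)\,dt
  &= \int_0^T (T-t)^{-a}\,dt = \frac{T^{1-a}}{1-a},\\
\int_0^T (T-t)\,\dot v^2(t)\,dt
  &= \int_0^T (T-t)^{1-2a}\,dt = \frac{T^{2(1-a)}}{2(1-a)},\\
\frac{\big(\int_0^T \dot v(t)\,dt\big)^2}{\int_0^T (T-t)\,\dot v^2(t)\,dt}
  &= \frac{T^{2(1-a)}}{(1-a)^2}\cdot\frac{2(1-a)}{T^{2(1-a)}}
   = \frac{2}{1-a}\;\xrightarrow[a\uparrow 1]{}\;+\infty .
\end{align*}
```

Hence no finite constant can bound the $H^1_0$ risk of an adapted unbiased estimator from above.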


Remark 2.6. Instead of the explicit construction of $v \in H^1_0$ above, to obtain a contradiction
we can also use the following duality result. On a measure space $(E, \mathcal{E}, \mu)$, if $g \ge 0$ is a
measurable function such that, for some constant c > 0, it holds

    $\int_E f g\,d\mu \le c\,\Big(\int_E f^2\,d\mu\Big)^{1/2}$, for every $f \in L^\infty \cap L^2(\mu)$, $f \ge 0$,

then it holds $g \in L^2(\mu)$, with $\|g\|_{L^2(\mu)} \le c$. The easy proof follows from considering the
continuous, linear functional $\varphi$ initially defined on $L^\infty \cap L^2(\mu)$ by $f \mapsto \int_E f g\,d\mu$ and then
applying Riesz theorem to its extension to $L^2(\mu)$.

In the proof above, a contradiction immediately follows from (4), letting $\mu(dt) = (T-t)\,dt$
and $g(t) = (T-t)^{-1}$.

We now provide a complete proof of Theorem 2.5. 

Proof. (General case.) Arguing by contradiction, we let $\xi \in L^2(\Omega, \mathbb{P}^u; H^1_0)$. For every
(deterministic) $v \in H^1_0$, arguing as above for the deduction of (3), we obtain instead

    $v(t) = \mathbb{E}^u\Big[\int_0^t (\dot\xi_s - \dot u_s)\,ds \int_0^T \dot v(s)\,dX^u_s\Big]$, for $t \in [0,T]$.

Then, we differentiate with respect to $t \in [0,T]$ (exchanging derivatives and expectation is
ensured by the finite risk assumption), and we obtain, for a.e. $t \in [0,T]$,

    $\dot v(t) = \mathbb{E}^u\Big[(\dot\xi_t - \dot u_t)\int_0^T \dot v(s)\,dX^u_s\Big]$.

At this stage, Cauchy-Schwarz inequality and Itô isometry yield

(5)    $|\dot v(t)|^2 \le \mathbb{E}^u\big[|\dot\xi_t - \dot u_t|^2\big]\int_0^T |\dot v(s)|^2\,ds$, for a.e. $t \in [0,T]$.

From this inequality, we easily obtain a contradiction, arguing as follows. Let $A \subset [0,T]$ be a
non-negligible Borel subset such that $\int_A \mathbb{E}^u[|\dot\xi_t - \dot u_t|^2]\,dt < 1$, which exists because of the finite
risk assumption and uniform integrability (notice that A does not depend upon v). Then,
integrating the above inequality for $t \in A$, we obtain

    $\int_A |\dot v(t)|^2\,dt \le \int_A \mathbb{E}^u\big[|\dot\xi_t - \dot u_t|^2\big]\,dt \int_0^T |\dot v(t)|^2\,dt$,

for every $\dot v \in L^2(0,T)$, in particular for every $\dot v \in L^2(A)$. Simply taking $\dot v = 1_A$, we obtain
the required contradiction. □


Actually, the result on the absence of unbiased estimators in $H^1_0$ can be slightly strengthened,
allowing for estimators whose bias is sufficiently regular. We state it as a corollary (of
the proof), remarking that similar deductions could be performed also in the cases that we
consider below.


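Corollary 2.7. Let $\xi$ be an estimator such that, for every $u \in \Theta$, $t \in [0,T]$, $\xi_t$ is $\mathbb{P}^u$-integrable,
and it holds, for some $C = (C_t)_{t\in[0,T]} \in L^2(0,T)$ (possibly depending upon $u \in \Theta$),

    $\Big|\frac{d}{dt}\frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\mathbb{E}^{u+\varepsilon v}\big[\xi_t - u_t - \varepsilon v(t)\big]\Big| \le C_t\,\|v\|_{H^1_0}$, a.e. $t \in [0,T]$, for every $v \in H^1_0$.

Then, the $H^1_0$ risk of the estimator $\xi$ is infinite, i.e.

    $\mathbb{E}^u\big[\|\xi - u\|^2_{H^1_0}\big] = \infty$, for every $u \in \Theta$.

Proof. We argue exactly as in the proof above, but we write

    $\mathbb{E}^{u+\varepsilon v}[\xi_t] = \mathbb{E}^{u+\varepsilon v}[u_t] + \varepsilon v(t) + b^{u+\varepsilon v}_t$,

where $b^u_t := \mathbb{E}^u[\xi_t - u_t]$ is the bias. After differentiation with respect to $\varepsilon$ and t, we obtain
(5) with $\mathbb{E}^u[|\dot\xi_t - \dot u_t|^2] + C_t^2$ in place of $\mathbb{E}^u[|\dot\xi_t - \dot u_t|^2]$ (up to a multiplicative constant)
and we conclude arguing as in the proof above. □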

We now address analogous results for the intermediate spaces $H^1_0 \subset W^{\alpha,2} \subset L^2$, for
$\alpha \in (0,1)$, defined as follows.

Definition 2.8. For $\alpha \in (0,1)$, $p \in (1,\infty)$, the fractional Sobolev space $W^{\alpha,p}\ (= W^{\alpha,p}(0,T))$
is defined as the space of functions $u \in L^p(0,T)$ such that their “energy” functional

    $\|u\|^p_{W^{\alpha,p}} := \int_0^T\!\!\int_0^T \frac{|u_t - u_s|^p}{|t-s|^{\alpha p+1}}\,dt\,ds$

is finite.


We refer to [DNPV12] for a survey of the theory of fractional Sobolev spaces, although here
we need nothing more than the definition above. The space $W^{\alpha,p}$, endowed with a suitable
norm, interpolates (in a sense that could be made precise) between the Sobolev space $W^{1,p}$
and $L^p$: for example, it holds $W^{\alpha',p} \subset W^{\alpha,p}$ for $0 < \alpha < \alpha' < 1$, and $H^1_0 \subset W^{\alpha,2}$, with

(6)    $\|u\|^2_{W^{\alpha,2}} \le 2\int_0^T |\dot u_r|^2 \int_r^T\!\!\int_0^r \frac{1}{|t-s|^{2\alpha}}\,ds\,dt\,dr \le C_{\alpha,T}\,\|u\|^2_{H^1_0}$.

From this inequality, the above theorem for estimators in $H^1_0$ could also be obtained from the
next results.

Let us first consider the Cramér-Rao bound in the quadratic case.


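Theorem 2.9 (Cramér-Rao inequality in $W^{\alpha,2}$). Let $\xi$ be an unbiased estimator. For every
$\alpha \in (0,1)$, it holds

    $\mathbb{E}^u\big[\|\xi - u\|^2_{W^{\alpha,2}}\big] \ge \int_0^T\!\!\int_0^T \frac{1}{|t-s|^{2\alpha}}\,dt\,ds$, for every $u \in \Theta$.

Equality is attained by the (efficient) estimator $\xi = X$.

In particular, if an estimator $\xi$ has finite $W^{\alpha,2}$ risk for some $\alpha \in [1/2,1)$ and $u \in \Theta$, then
it is not unbiased.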


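Proof. We introduce the notation $\Delta_t := \xi_t - u_t$, for $t \in [0,T]$, so that, by Fubini theorem, we
write

    $\mathbb{E}^u\big[\|\xi - u\|^2_{W^{\alpha,2}}\big] = \int_0^T\!\!\int_0^T \frac{\mathbb{E}^u\big[|\Delta_t - \Delta_s|^2\big]}{|t-s|^{2\alpha+1}}\,dt\,ds$.

If $\xi$ is an unbiased estimator and $v \in H^1_0$, we argue (once again) to obtain (3), and subtract
such identity for $s, t \in [0,T]$, thus

    $v(t) - v(s) = \mathbb{E}^u\Big[(\Delta_t - \Delta_s)\int_0^T \dot v(r)\,dX^u_r\Big]$.

Hence, Cauchy-Schwarz inequality and Itô isometry give the lower bound

    $\mathbb{E}^u\big[|\Delta_t - \Delta_s|^2\big] \ge \frac{|v(t) - v(s)|^2}{\int_0^T \dot v^2(r)\,dr}$, for $s, t \in [0,T]$.

We let $\dot v(r) = 1_{[s\wedge t,\,s\vee t]}(r)$, so that

    $\mathbb{E}^u\big[|\Delta_t - \Delta_s|^2\big] \ge |t-s|$, for $s, t \in [0,T]$.

The Cramér-Rao bound then follows:

    $\int_0^T\!\!\int_0^T \frac{\mathbb{E}^u\big[|\Delta_t - \Delta_s|^2\big]}{|t-s|^{2\alpha+1}}\,dt\,ds \ge \int_0^T\!\!\int_0^T \frac{1}{|t-s|^{2\alpha}}\,dt\,ds$.

Finally, if $\xi = X$, then $X - u = X^u$, thus it holds

    $\mathbb{E}^u\big[|X^u_t - X^u_s|^2\big] = |t-s|$, for $s, t \in [0,T]$,

and the Cramér-Rao lower bound is attained:

    $\int_0^T\!\!\int_0^T \frac{\mathbb{E}^u\big[|X^u_t - X^u_s|^2\big]}{|t-s|^{2\alpha+1}}\,dt\,ds = \int_0^T\!\!\int_0^T \frac{1}{|t-s|^{2\alpha}}\,dt\,ds$.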

□ 
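
The right hand side of the bound in Theorem 2.9 can be evaluated explicitly: for $\alpha < 1/2$ one finds $\int_0^T\!\int_0^T |t-s|^{-2\alpha}\,dt\,ds = 2T^{2-2\alpha}/((1-2\alpha)(2-2\alpha))$, while the integral diverges for $\alpha \ge 1/2$. The following sketch (function names are illustrative, not from the paper) implements this closed form and cross-checks it with a midpoint rule, after reducing the double integral to $2\int_0^T (T-r)\,r^{-2\alpha}\,dr$ and substituting $r = y^{1/(1-2\alpha)}$ to remove the singularity at r = 0.

```python
import math


def cramer_rao_bound(alpha, T=1.0):
    """Closed form of int_0^T int_0^T |t-s|^(-2*alpha) dt ds (infinite for alpha >= 1/2)."""
    if alpha >= 0.5:
        return math.inf
    return 2.0 * T ** (2.0 - 2.0 * alpha) / ((1.0 - 2.0 * alpha) * (2.0 - 2.0 * alpha))


def cramer_rao_bound_numeric(alpha, T=1.0, n=2000):
    """Midpoint-rule check, valid for alpha < 1/2 only.

    With r = y^k and k = 1/(1 - 2*alpha), r^(-2*alpha) dr = k dy, so that
    2 * int_0^T (T - r) r^(-2*alpha) dr = 2*k * int_0^(T^(1-2*alpha)) (T - y^k) dy.
    """
    k = 1.0 / (1.0 - 2.0 * alpha)
    upper = T ** (1.0 - 2.0 * alpha)
    h = upper / n
    total = sum(T - ((i + 0.5) * h) ** k for i in range(n))
    return 2.0 * k * total * h


for alpha in (0.0, 0.25, 0.4, 0.49):
    print(alpha, cramer_rao_bound(alpha), cramer_rao_bound_numeric(alpha))
print(0.5, cramer_rao_bound(0.5))
```

The printed values grow without bound as $\alpha$ approaches 1/2, which is the quantitative content of the non-existence statement.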


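In the case of a general exponent $p \in (1,\infty)$ (with q = p/(p − 1)), arguing similarly, we
obtain the following bound, in $W^{\alpha,p}$. As above, we let $c_q = \mathbb{E}[|Y|^q]$ be the q-th moment of a
standard Gaussian random variable.

Theorem 2.10 (Cramér-Rao inequality in $W^{\alpha,p}$). Let $\xi$ be an unbiased estimator. For every
$\alpha \in (0,1)$, $p \in (1,\infty)$, it holds

    $\mathbb{E}^u\big[\|\xi - u\|^p_{W^{\alpha,p}}\big] \ge \frac{2\,T^{1+p(1/2-\alpha)}}{c_q^{p/q}\;p\,\max\{0,\,(1/2-\alpha)\}\,\big(1+p(1/2-\alpha)\big)}$.

Since

    $\mathbb{E}^u\big[|X^u_t - X^u_s|^p\big] = c_p\,|t-s|^{p/2}$,

the risk of the estimator $\xi = X$ is given by

    $\int_0^T\!\!\int_0^T \frac{\mathbb{E}^u\big[|X^u_t - X^u_s|^p\big]}{|t-s|^{\alpha p+1}}\,dt\,ds = c_p \int_0^T\!\!\int_0^T \frac{1}{|t-s|^{1-p(1/2-\alpha)}}\,dt\,ds$.

As in Remark 2.4 above, we conclude that X is not an efficient estimator with respect to the
risk in $W^{\alpha,p}$, for $p \ne 2$.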


Remark 2.11. Before we conclude this section, we remark that all the bounds above can
be generalized (at least) to the case of a continuous Gaussian martingale, with quadratic
variation process $\int_0^t \sigma_s^2\,ds$, $t \in [0,T]$, and also by introducing different energies, such as

    $\int_0^T\!\!\int_0^T \frac{|u(t) - u(s)|^p}{|t-s|^{\alpha p+1}}\,\mu(dt,ds)$,

where $\mu$ is a measure on $[0,T]^2$ (a natural choice would be to take $\mu$ somehow related to
$\sigma^2$). However, we choose to limit the discussion to the case of the Brownian motion, to limit
technicalities and emphasize the role played by the norm chosen to estimate the risk.


3. Intensity estimation for the Cox process 

Throughout this section, we fix T > 0 and let $X = (X_t)_{t\in[0,T]}$ be a Poisson process defined
on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in[0,T]}, \mathbb{P})$, with jump times $(\tau_k)_{k\ge1}$ (for $k \ge 1$,
we let $\tau_k(\omega) = T$ in the eventuality that no k-th jump occurs). As a space of parameters $\Theta$,
we consider the set of all absolutely continuous, (strictly) increasing, $\mathcal{F}_0$-measurable processes
$u = (u_t)_{t\in[0,T]}$ such that their a.e. derivatives $(\dot u_t)_{t\in[0,T]}$ satisfy the assumptions of Girsanov
theorem for the Poisson process (the proofs work also for slightly smaller sets). Given $u \in \Theta$,
we define the probability $\mathbb{P}^u := L^u\,\mathbb{P}$, where

    $L^u := \prod_{k=1}^{X_T} \dot u_{\tau_k}\,\exp\Big(-\int_0^T (\dot u_s - 1)\,ds\Big)$.

Girsanov theorem entails that, with respect to the probability $\mathbb{P}^u$, the process X is a Cox
process with intensity $(\dot u_t)_{t\in[0,T]}$ (see e.g. [JYC09, Section 8.4] for details on related doubly
stochastic Poisson processes). Notice that $\mathbb{P}^u(A)$ does not depend on u for $A \in \mathcal{F}_0$, thus e.g.
for $t \in [0,T]$, $u, v \in \Theta$, $u_t$ is integrable with respect to $\mathbb{P}^v$ and its expectation $\mathbb{E}^v[u_t]$ actually
does not depend on v.

We address the problem of estimating u, or equivalently the intensity of X w.r.t. $\mathbb{P}^u$, based
on a single observation of X. In the case of a deterministic intensity, i.e. when X is an
inhomogeneous Poisson process, this is investigated e.g. in [PR09], and, similarly to the case
of shifted Brownian motion, the following definition is given.
Definition 3.1. Any measurable stochastic process $\xi : \Omega \times [0,T] \to \mathbb{R}$ is called an estimator
of the intensity u. An estimator of the intensity u is said to be unbiased if, for every $u \in \Theta$,
$t \in [0,T]$, $\xi_t$ is $\mathbb{P}^u$-integrable and it holds $\mathbb{E}^u[\xi_t] = \mathbb{E}^u[u_t]$.

As in the previous section, we omit the specification “of the intensity u” and simply refer to
estimators.

Privault and Réveillac studied the estimation problem, in the case of deterministic
intensities, w.r.t. the risk in $L^2(\mu)$, defined as in (1), for any finite Borel measure $\mu$ on [0,T]. Their
set of parameters $\Theta$ consists of all deterministic, absolutely continuous, increasing
processes u, see [PR09, Definition 2.1]. We briefly show how a similar argument applies
as well to the case of stochastic intensities.


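Theorem 3.2 (Cramér-Rao inequality in $L^2(\mu)$). For any unbiased estimator $\xi$, it holds

    $\mathbb{E}^u\big[\|\xi - u\|^2_{L^2(\mu)}\big] \ge \int_0^T \mathbb{E}^u[u_t]\,\mu(dt)$, for every $u \in \Theta$,

and equality is attained by the (efficient) estimator $\xi = X$.


Proof. For every process $v \in \Theta$, since $\xi$ is unbiased we have

    $\mathbb{E}^{u+\varepsilon v}[\xi_t] = \mathbb{E}^{u+\varepsilon v}[u_t + \varepsilon v_t] = \mathbb{E}^{u+\varepsilon v}[u_t] + \varepsilon\,\mathbb{E}^{u+\varepsilon v}[v_t]$, for $t \in [0,T]$.

Differentiating w.r.t. $\varepsilon$, as in [PR09, Proposition 2.3] we obtain the identity

(7)    $\mathbb{E}^u[v_t] = \frac{d}{d\varepsilon}\Big|_{\varepsilon=0}\mathbb{E}^{u+\varepsilon v}[\xi_t - u_t] = \mathbb{E}^u\Big[(\xi_t - u_t)\int_0^T \frac{\dot v_s}{\dot u_s}\,(dX_s - \dot u_s\,ds)\Big]$.

By Cauchy-Schwarz inequality and the fact that X is a Cox process with intensity $\dot u$, we get,
for $t \in [0,T]$,

    $\mathbb{E}^u[v_t]^2 \le \mathbb{E}^u\big[(\xi_t - u_t)^2\big]\;\mathbb{E}^u\Big[\int_0^T \frac{\dot v_s^2}{\dot u_s}\,ds\Big]$,  thus  $\mathbb{E}^u\big[(\xi_t - u_t)^2\big] \ge \mathbb{E}^u[u_t]$,

once we let $\dot v = \dot u\,1_{[0,t]}$. The thesis follows by integration w.r.t. $\mu$.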


□ 
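
The equality case of Theorem 3.2 can be illustrated in the deterministic-intensity setting of [PR09]. The sketch below is a minimal simulation under assumptions chosen only for the example: $\mu$ is the Lebesgue measure on [0,1] and the compensator is $u_t = t^2 + t$ (so $\dot u_t = 2t + 1 > 0$). The process X is built by time change from a unit-rate Poisson process, and the $L^2$ risk of $\xi = X$ is compared with $\int_0^1 u_t\,dt = 5/6$.

```python
import random


def cox_l2_risk(T=1.0, n_steps=100, n_paths=4000, seed=0):
    """Monte Carlo estimate of E^u[ int_0^T |X_t - u_t|^2 dt ] for the estimator xi = X."""
    rng = random.Random(seed)

    def u(t):
        # Illustrative deterministic compensator, increasing with u_0 = 0.
        return t * t + t

    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        # Time change: if N is a unit-rate Poisson process, X_t = N(u_t) has intensity du_t/dt.
        arrivals = []
        clock = rng.expovariate(1.0)
        while clock <= u(T):
            arrivals.append(clock)
            clock += rng.expovariate(1.0)
        count = 0  # current value of X_t along the time grid
        path_risk = 0.0
        for k in range(1, n_steps + 1):
            level = u(k * dt)
            while count < len(arrivals) and arrivals[count] <= level:
                count += 1
            path_risk += (count - level) ** 2 * dt  # Riemann sum for the L^2(dt) norm
        total += path_risk
    return total / n_paths


risk = cox_l2_risk()
print(risk, "vs. Cramer-Rao bound", 1.0 / 3.0 + 1.0 / 2.0)
```

Unlike the Brownian case, changing the compensator u changes the value of the bound, in line with the comment that follows.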


Differently from the case of Brownian motion, the lower bound depends on the parameter
$u \in \Theta$. This is quite natural in view of the classical, finite-dimensional, Cramér-Rao lower
bound, where the inverse of the Fisher information appears, measuring the local regularity
of the densities; when u is small, the density becomes very peaked and the bound becomes
trivial.

Since the intensity $u \in \Theta$ is absolutely continuous, also in this case we investigate lower
bounds for the $H^1_0$ risk: once again, no unbiased estimators exist. In the next result, we
also collect the case of fractional Sobolev spaces $W^{\alpha,2}$, for $\alpha \in (0,1)$.


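Theorem 3.3. For any unbiased estimator $\xi$, $\alpha \in (0,1)$, it holds

    $\mathbb{E}^u\big[\|\xi - u\|^2_{W^{\alpha,2}}\big] \ge 2\int_0^T \mathbb{E}^u[\dot u_r] \int_r^T\!\!\int_0^r \frac{1}{|t-s|^{2\alpha+1}}\,ds\,dt\,dr$,

for every $u \in \Theta$. There exists no unbiased estimator $\xi$ with finite risk in $W^{\alpha,2}$, for $\alpha \in [1/2,1)$,
as well as in $H^1_0$.

Proof. We subtract (7) for two different times $s, t \in [0,T]$ and apply Cauchy-Schwarz,
obtaining

    $\mathbb{E}^u\big[|\Delta_t - \Delta_s|^2\big] \ge \frac{\mathbb{E}^u[v_t - v_s]^2}{\mathbb{E}^u\Big[\int_0^T \frac{\dot v_r^2}{\dot u_r}\,dr\Big]}$,

where $\Delta_t := \xi_t - u_t$. Hence, taking $\dot v_r = 1_{[s\wedge t,\,s\vee t]}(r)\,\dot u_r$, we have

    $\mathbb{E}^u\big[|\Delta_t - \Delta_s|^2\big] \ge \mathbb{E}^u\big[|u_t - u_s|\big]$, for every $s, t \in [0,T]$.

If s < t, then the right hand side above coincides with $\mathbb{E}^u[\int_s^t \dot u_r\,dr]$. Integrating with respect
to $s, t \in [0,T]$, with measure $|t-s|^{-2\alpha-1}\,dt\,ds$, we obtain the required inequality. To deduce
that no unbiased estimators with finite risk exist, it is sufficient to notice that the double
integral equals $+\infty$, for $\alpha \in [1/2,1)$, and $\mathbb{E}^u[\dot u_r] > 0$ for a.e. $r \in [0,T]$. The case of $H^1_0$ follows
at once from inequality (6). □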


4. Stochastic calculus of variations 


In this section, we briefly recall some results concerning Malliavin calculus on the classical
Wiener space (we refer to the monograph [Nua06] for details), limiting ourselves to the essentials
for constructing super-efficient estimators.

In the framework of Section 2, i.e. if $X = (X_t)_{t\in[0,T]}$ is a Brownian motion (on the finite
interval [0,T]), defined on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in[0,T]}, \mathbb{P})$, we introduce
the space $\mathcal{S}$ of smooth functionals, as those of the form

    $F = \phi(X_{t_1}, \ldots, X_{t_n})$,

for some $t_1, \ldots, t_n \in [0,T]$ and $\phi \in C^\infty_b(\mathbb{R}^n)$ ($n \ge 1$). The Malliavin derivative DF is then
defined as the $L^2(0,T)$-valued random variable

    $D_tF := \sum_{i=1}^n \frac{\partial\phi}{\partial x_i}(X_{t_1}, \ldots, X_{t_n})\,1_{[0,t_i]}(t)$, for a.e. $t \in [0,T]$.

For $h \in L^2(0,T)$, we let $D_hF := \int_0^T D_tF\,h(t)\,dt$ (in the classical Wiener space framework, this
corresponds to differentiation along the direction in $H^1_0$ given by $t \mapsto \int_0^t h(s)\,ds$, $t \in [0,T]$:
differently from the previous sections, we prefer to focus on the space $L^2(0,T)$ instead of $H^1_0$).
The Cameron-Martin theorem entails the following integration by parts formula for smooth 
functionals. 


Proposition 4.1. Let $F \in \mathcal{S}$ and $h \in L^2(0,T)$. Then, it holds

(8)    $\mathbb{E}[D_hF] = \mathbb{E}[F\,h^*]$,

where we let $h^* = \int_0^T h(s)\,dX_s$ be the Itô(-Wiener) integral.
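
A quick numerical sanity check of the integration by parts formula (8) is possible for a one-point functional. The sketch below makes illustrative choices that are not in the paper: $F = \sin(X_{t_1})$ with $t_1 = 1/2$, T = 1 and $h \equiv 1$. Then $D_tF = \cos(X_{t_1})\,1_{[0,t_1]}(t)$, so $D_hF = t_1\cos(X_{t_1})$, while $h^* = X_T$; both expectations equal $t_1 e^{-t_1/2}$.

```python
import math
import random


def integration_by_parts_check(t1=0.5, T=1.0, n_samples=200_000, seed=0):
    """Monte Carlo estimates of E[D_h F] and E[F h*] for F = sin(X_{t1}) and h = 1 on [0, T]."""
    rng = random.Random(seed)
    lhs = 0.0
    rhs = 0.0
    for _ in range(n_samples):
        x_t1 = rng.gauss(0.0, math.sqrt(t1))             # X_{t1} under P
        x_T = x_t1 + rng.gauss(0.0, math.sqrt(T - t1))   # independent increment up to time T
        lhs += math.cos(x_t1) * t1    # D_h F = int_0^T D_t F dt = cos(X_{t1}) * t1
        rhs += math.sin(x_t1) * x_T   # F h*, with h* = int_0^T dX_s = X_T
    return lhs / n_samples, rhs / n_samples


lhs, rhs = integration_by_parts_check()
print(lhs, rhs, 0.5 * math.exp(-0.25))
```

The increment $X_T - X_{t_1}$ is independent of F and centred, so only the part of $h^*$ up to time $t_1$ contributes, mirroring the indicator $1_{[0,t_1]}$ in the derivative.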

A straightforward consequence of the integration by parts formula above is closability for
the operator $D : \mathcal{S} \subset L^2(\Omega) \to L^2(\Omega \times [0,T])$. The domain of its closure defines the
Sobolev-Malliavin space $\mathbb{D}^{1,2}$, on which the operator D extends continuously.

Proposition 4.2 (chain rule). Let $F_1, \ldots, F_n \in \mathbb{D}^{1,2}$ and $\phi \in C^1_b(\mathbb{R}^n)$. Then, it holds
$\phi(F_1, \ldots, F_n) \in \mathbb{D}^{1,2}$, with

    $D_t\phi(F_1, \ldots, F_n) = \sum_{i=1}^n \frac{\partial\phi}{\partial x_i}(F_1, \ldots, F_n)\,D_tF_i$, for a.e. $t \in [0,T]$.
Remark 4.3 (Malliavin Calculus for a Cox process). It seems reasonable to develop a theory of
differential calculus for Cox processes, akin to that for Poisson processes introduced in [PR09].
In the setting of Section 3, let $(X_t)_{t\in[0,T]}$ be a Cox process on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in[0,T]}, \mathbb{P})$,
with intensity $\lambda = (\lambda_t)_{t\in[0,T]}$ and jump times $(\tau_k)_{k\ge1}$. Then, we let $\mathcal{S}$ be the space of random
variables F of the form

    $F = f_0\,1_{\{X_T=0\}} + \sum_{n=1}^\infty 1_{\{X_T=n\}}\,f_n(\tau_1, \ldots, \tau_n)$,

where, for $n \ge 0$, $f_n : \Omega \times \mathbb{R}^n \to \mathbb{R}$ is bounded, measurable with respect to $\mathcal{F}_0$ (i.e. its
randomness depends only on $\lambda$) and for every $\omega \in \Omega$, $f_n(\omega; \cdot)$ is $C^1$ and symmetric, i.e.,
$f_n(\omega; t_1, \ldots, t_n)$ is left unchanged by any permutation of the coordinates $(t_1, \ldots, t_n)$, and such that,
for every $n \ge 0$, it holds $f_n(\omega; t_1, \ldots, t_n) = f_{n+1}(\omega; t_1, \ldots, t_n, T)$, for $\omega \in \Omega$, $t_1, \ldots, t_n \in \mathbb{R}$.
For $F \in \mathcal{S}$, we may let $DF(\omega) \in L^2(0,T)$,

    $D_tF := -\sum_{n=1}^\infty 1_{\{X_T=n\}} \sum_{k=1}^n 1_{[0,\tau_k]}(t)\,\frac{1}{\lambda_t}\,\partial_k f_n(\tau_1, \ldots, \tau_n)$,

for a.e. $t \in [0,T]$.

One can prove the validity of the chain rule and an integration-by-parts formula, providing
some notion of divergence, thus defining Sobolev-Malliavin spaces in this setting. However, it
is presently not clear how to effectively use such calculus to produce super-efficient Stein-type
estimators, see Remark 5.2 below.


5. Super-efficient estimators 

In this section, we address the problem of Stein type, super-efficient estimators for the
drift of a shifted Brownian motion, with respect to risks computed in the Sobolev spaces
introduced above.

For $L^2(\mu)$-type risks, super-efficient estimators of the form $X + \xi$ were first studied in
[PR08]. Privault and Réveillac consider a process $\xi_t = D_t\log F$, $t \in [0,T]$, where F is any
$\mathbb{P}$-a.s. non-negative random variable in $\mathbb{D}^{1,2}$ such that $\sqrt{F}$ is $\Delta$-superharmonic w.r.t. a suitable
“Laplacian” operator $\Delta$, actually related to the structure of the risk considered (which is not,
in the Gaussian case, the usual Gross-Malliavin Laplacian). We show that a similar approach
leads to super-efficient estimators also in fractional Sobolev spaces $W^{\alpha,2}$, for $\alpha \in [0,1/2)$ (of
course, this perturbative approach does not provide any information for larger values of $\alpha$).
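Indeed, for every $\xi = (\xi_t)_{t\in[0,T]}$, with $\mathbb{E}^u[\|\xi\|^2_{W^{\alpha,2}}] < \infty$, we write

    $\mathbb{E}^u\big[\|X + \xi - u\|^2_{W^{\alpha,2}}\big] = \mathbb{E}^u\big[\|X - u\|^2_{W^{\alpha,2}}\big] + \mathbb{E}^u\big[\|\xi\|^2_{W^{\alpha,2}}\big]$
    $\qquad + 2\int \mathbb{E}^u\big[(\xi_t - \xi_s)\,[(X_t - u_t) - (X_s - u_s)]\big]\,d\mu_\alpha(s,t)$,

where we introduce the Borel measure $\mu_\alpha(ds,dt) = 2\,(t-s)^{-1-2\alpha}\,1_{\{s<t\}}\,ds\,dt$ on $[0,T]^2$. If
$\xi_t - \xi_s \in \mathbb{D}^{1,2}$ for every $s, t \in [0,T]$, with s < t, the integration by parts (8) for the Malliavin
derivative (to be rigorous, we should write $D^u$ in what follows because the derivative is built
with respect to the probability $\mathbb{P}^u$, not $\mathbb{P}$) entails

    $\mathbb{E}^u\big[(\xi_t - \xi_s)\,[(X_t - u_t) - (X_s - u_s)]\big] = \mathbb{E}^u\big[(\xi_t - \xi_s)\,(X^u_t - X^u_s)\big]$

    $\qquad = \mathbb{E}^u\Big[(\xi_t - \xi_s)\int_0^T 1_{[s,t]}(r)\,dX^u_r\Big]$

    $\qquad = \mathbb{E}^u\big[D_{s,t}(\xi_t - \xi_s)\big]$,

where $D_{s,t}F := \int_s^t D_rF\,dr$. Hence, if we let $\rho = \mathbb{E}^u[\|X - u\|^2_{W^{\alpha,2}}]$ denote the Cramér-Rao
lower bound, we deduce

    $\mathbb{E}^u\big[\|X + \xi - u\|^2_{W^{\alpha,2}}\big] = \rho + \int \mathbb{E}^u\big[|\xi_t - \xi_s|^2 + 2\,D_{s,t}(\xi_t - \xi_s)\big]\,\mu_\alpha(ds,dt)$.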

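It is then convenient to introduce the following notion of Laplacian,

(9)    $\Delta_\alpha F := \int_{[0,T]^2} (D_{s,t})^2F\,\mu_\alpha(ds,dt)$,

initially defined on $\mathcal{S}$. Arguing e.g. as in [PR08, Proposition 4.5], it is possible to show that
$\Delta_\alpha : \mathcal{S} \subset L^2(\Omega, \mathbb{P}^u) \to L^2(\Omega, \mathbb{P}^u)$ is closable and that the random variables $G \in \mathbb{D}^{1,2}$, with

(10)    $D_{s,t}G \in \mathbb{D}^{1,2}$, for a.e. $s, t \in [0,T]$, and $D^2_{s,t}G \in L^2(\Omega \times [0,T]^2, \mathbb{P} \times \mu_\alpha)$,

belong to the domain of the closure, so that $\Delta_\alpha G$ is well-defined (actually, by the same
expression as in (9)). Moreover, the operator $\Delta_\alpha$ is of diffusion type, i.e., for every $F_1, \ldots, F_n \in \mathcal{S}$,
$\phi \in C^2_b(\mathbb{R}^n)$, the function $\phi \circ F$ (we write $F = (F_1, \ldots, F_n)$) belongs to the domain of $\Delta_\alpha$,
and it holds

(11)    $\Delta_\alpha(\phi \circ F) = \sum_{i=1}^n \frac{\partial\phi}{\partial x_i}(F)\,\Delta_\alpha F_i + \sum_{i,j=1}^n \frac{\partial^2\phi}{\partial x_i\partial x_j}(F)\,\Gamma_\alpha(F_i, F_j)$, $\mathbb{P}$-a.e. in $\Omega$,

with $\Gamma_\alpha(F_i, F_j) = \int_{[0,T]^2} D_{s,t}F_i\,D_{s,t}F_j\,\mu_\alpha(ds,dt)$, for $i, j \in \{1, \ldots, n\}$ (the Malliavin matrix
associated to $(F_i)_{i=1}^n$). This identity, by density, extends under natural integrability
assumptions on F as well as on $\phi$.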

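The operator $\Delta_\alpha$ enters the picture if we assume that the process $\xi$ is of the form
$\xi_t = D_{0,t}\log F^2$, $t \in [0,T]$, for some $\mathbb{P}$-a.e. positive random variable $F \in \mathbb{D}^{1,2}$, with $G = \log F^2$
satisfying (10). If we are in a position to apply the chain rule (11), it holds

    $\Delta_\alpha\log F^2 = 2\,\frac{\Delta_\alpha F}{F} - \frac{2}{F^2}\,\Gamma_\alpha(F,F)$

    $\phantom{\Delta_\alpha\log F^2} = 2\,\frac{\Delta_\alpha F}{F} - \frac{1}{2}\,\Gamma_\alpha(\log F^2, \log F^2)$,

which can be explicitly written in terms of $\xi$ as

    $4\,\frac{\Delta_\alpha F}{F} = \int_{[0,T]^2} \big[2\,D_{s,t}(\xi_t - \xi_s) + |\xi_t - \xi_s|^2\big]\,\mu_\alpha(ds,dt)$.

As a result, we obtain

    $\mathbb{E}^u\big[\|X + \xi - u\|^2_{W^{\alpha,2}}\big] = \rho + 4\,\mathbb{E}^u\Big[\frac{\Delta_\alpha F}{F}\Big]$.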

Therefore, in order to find super-efficient estimators, it is enough to prove existence of some
$\xi$ (independent of u) that can be written in terms of some F (possibly depending on u), with
$\Delta_\alpha F \le 0$ (i.e., super-harmonic) with strict inequality on a set of positive $\mathbb{P}^u$ (or equivalently
$\mathbb{P}$) measure. In case of shifted Brownian motion, we provide the following example.


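Example 5.1. Let F be a r.v. of the form $F = \phi(X_{t_1} - X_{t_0}, \ldots, X_{t_n} - X_{t_{n-1}})$, for some
$0 = t_0 < t_1 < \ldots < t_n < T$ (with $\phi : \mathbb{R}^n \to \mathbb{R}$ sufficiently regular, in order to perform all
the computations below). Then, by (11), we can express $\Delta_\alpha F$ in terms of $\nabla\phi$, $\nabla^2\phi$, $\Delta_\alpha(\delta_iX)$
and

    $\Gamma_\alpha(\delta_iX, \delta_jX) = \int_{[0,T]^2} D_{s,t}\delta_iX\,D_{s,t}\delta_jX\,\mu_\alpha(ds,dt)$, for $i, j \in \{1, \ldots, n\}$,

with the notation $\delta_iX = X_{t_i} - X_{t_{i-1}}$.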

Before we proceed further, we have to take into account that, with different probabilities
$\mathbb{P}^u$, the r.v.'s may have different derivatives $DF = D^uF$ and Laplacians $\Delta_\alpha F = \Delta^u_\alpha F$, since
the calculus w.r.t. $\mathbb{P}^u$ is “modelled” on the process $X^u = X - u$, thus, for $h \in L^2(0,T)$,
$t \in [0,T]$, it holds

    $D_hX_t = D_hX^u_t + D_hu_t = \int_0^t h(s)\,ds + D_hu_t$

and

    $\Delta_\alpha X_t = \Delta_\alpha X^u_t + \Delta_\alpha u_t = \Delta_\alpha u_t$,

provided that $u_t$ is sufficiently regular. To proceed further with computations, we assume
that the process u is deterministic, i.e. we restrict the space of parameters $\Theta$ to $H^1_0$ only, so
that $D_hu_t = \Delta_\alpha u_t = 0$, ruling out the problem of possible dependence upon u of the Malliavin
calculus that we consider. Then, (11) reduces to

    $\Delta_\alpha F = \sum_{i,j=1}^n \frac{\partial^2\phi}{\partial x_i\partial x_j}(\delta_1X, \ldots, \delta_nX)\,a_{i,j}$,

where, for $i, j \in \{1, \ldots, n\}$, with $t_0 = 0$,

    $a_{i,j} := \int_{[0,T]^2} \Big(\int_s^t 1_{[t_{i-1},t_i]}(r)\,dr\Big)\Big(\int_s^t 1_{[t_{j-1},t_j]}(r)\,dr\Big)\,\mu_\alpha(dt,ds)$.


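To prove that the symmetric matrix $A := (a_{i,j})_{i,j=1}^n$ is well-defined and invertible, we argue
as follows: for every $v = (v_i)_{i=1}^n \in \mathbb{R}^n$, it holds, using the notation $\langle Av, v\rangle := \sum_{i,j=1}^n a_{i,j}v_iv_j$,

    $\langle Av, v\rangle = \int_{[0,T]^2} \Big|\sum_{i=1}^n v_i\int_s^t 1_{[t_{i-1},t_i]}(r)\,dr\Big|^2\,\mu_\alpha(dt,ds)$

    $\phantom{\langle Av, v\rangle} = \int_{[0,T]^2} |\tilde v(t) - \tilde v(s)|^2\,\mu_\alpha(dt,ds) = \|\tilde v\|^2_{W^{\alpha,2}}$,

where we let $\tilde v(t) = \int_0^t \sum_{i=1}^n 1_{[t_{i-1},t_i]}(s)\,v_i\,ds$. From this identity and (6) we deduce that A
is well-defined, while non-degeneracy follows from the fact that, if $\|\tilde v\|_{W^{\alpha,2}} = 0$, then $\tilde v$ is
constant, which cannot happen except when v = 0.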

We let $B := (b_{i,j})_{i,j=1}^n$ be the inverse matrix of A, and consider the function

    $\phi(x) := \langle Bx, x\rangle^a$, $x \in \mathbb{R}^n$,

for a suitable choice of $a \in \mathbb{R}$. Then, by formally applying the chain rule in $\mathbb{R}^n$, it holds

    $\sum_{i,j=1}^n \frac{\partial^2\phi}{\partial x_i\partial x_j}\,a_{i,j} = 2a\,\big(2(a-1) + n\big)\,\langle Bx, x\rangle^{a-1}$,

which suggests the choice $a \in (1 - n/2, 0)$ (and $n \ge 3$). However, for a in this range, $\phi$ is
not $C^2_b(\mathbb{R}^n)$ and in order to rigorously conclude super-efficiency for an estimator of the form
$X_t + D_{0,t}\log F^2$, $t \in [0,T]$, we have to justify all the applications of the chain rule above.
Indeed, the only non-trivial step is to prove the following estimate, for every $u \in H^1_0$:

    $\mathbb{E}^u\big[\langle B(\delta X), (\delta X)\rangle^{-1}\big] < \infty$.

In turn, this holds true because we may pass to the joint law of $\delta X = (\delta_iX)_{i=1}^n$, which is
Gaussian non-degenerate (possibly non-centred), and the integrand can then be estimated
from above by some constant times the function $x \mapsto |x|^{-2}$ (here the assumption $n \ge 3$ plays
a role too).

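Next, to prove e.g. that $\log F \in \mathbb{D}^{1,2}$, with

    $D_t\log F = 2a\,\frac{\sum_{i,j=1}^n b_{i,j}\,\delta_jX\,1_{[t_{i-1},t_i]}(t)}{\langle B(\delta X), (\delta X)\rangle}$, for a.e. $t \in [0,T]$,

it is sufficient to notice that, assuming this identity true, then we could estimate, by
Cauchy-Schwarz inequality,

    $\int_0^T \mathbb{E}^u\big[|D_t\log F|^2\big]\,dt \le 4a^2\,T\,\mathrm{trace}(B)\,\mathbb{E}^u\big[\langle B(\delta X), (\delta X)\rangle^{-1}\big]$.

This a priori estimate entails $\log F \in \mathbb{D}^{1,2}$, by suitably approximating the function $z \mapsto \log z$
with smooth functions.

Similarly, to estimate $\mathbb{E}^u[\|\xi\|^2_{W^{\alpha,2}}]$, we apply Cauchy-Schwarz and deduce, for $s, t \in [0,T]$,
with s < t,

    $\mathbb{E}^u\big[|D_{s,t}\log F^2|^2\big] \le 4a^2\,(t-s)\,\mathrm{trace}(B)\,\mathbb{E}^u\big[\langle B(\delta X), (\delta X)\rangle^{-1}\big]$,

which can be integrated with respect to $\mu_\alpha$ (recall that $\alpha \in (0,1/2)$).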
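
The algebraic identity used in Example 5.1, namely $\sum_{i,j} a_{i,j}\,\partial^2_{ij}\phi = 2a(2(a-1)+n)\langle Bx,x\rangle^{a-1}$ for $\phi(x) = \langle Bx,x\rangle^a$ and $B = A^{-1}$, can be checked by finite differences. The sketch below uses a small symmetric positive definite matrix chosen only for illustration (it is not the matrix A of the example, whose entries depend on $\mu_\alpha$ and on the time grid), together with its exact inverse, n = 3 and a = −1/4, which lies in the range $(1 - n/2, 0)$.

```python
def quad_form(M, x, y):
    """Bilinear form <Mx, y> for a square matrix given as nested lists."""
    n = len(x)
    return sum(M[i][j] * x[j] * y[i] for i in range(n) for j in range(n))


# Illustrative symmetric positive definite matrix and its exact inverse (A times B = identity).
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
B = [[0.75, -0.5, 0.25], [-0.5, 1.0, -0.5], [0.25, -0.5, 0.75]]


def phi(x, a):
    return quad_form(B, x, x) ** a


def weighted_hessian_sum(x, a, h=1e-3):
    """Central finite-difference approximation of sum_{i,j} a_ij * d^2 phi / dx_i dx_j."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return phi(y, a)
            second = (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
            total += A[i][j] * second
    return total


def closed_form(x, a):
    n = len(x)
    return 2 * a * (2 * (a - 1) + n) * quad_form(B, x, x) ** (a - 1)


x0 = [1.0, 0.5, -0.3]
print(weighted_hessian_sum(x0, -0.25), closed_form(x0, -0.25))
```

Both printed values are negative, reflecting the super-harmonicity that the choice $a \in (1 - n/2, 0)$ is meant to produce; for n = 2 the admissible range is empty, which is why $n \ge 3$ is required.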


In conclusion, the example above shows that, in the case of deterministic shifts, i.e., $\Theta = H^1_0$,
we are able to explicitly build super-efficient Stein-type estimators. Although it seems
reasonable, we do not know whether this technique can be extended to stochastic shifts; it
would be even more interesting to provide super-efficient adapted estimators, see also Remark
2.3 above.


Remark 5.2 (Stein estimators for Cox processes). In case of Cox processes, nothing prevents us
from performing a similar argument using, in place of Malliavin calculus, the calculus sketched
in Remark 4.3. The case of Poisson processes and $L^2(\mu)$-type risks is investigated in [PR09].
However, here we currently face a strong limitation to provide explicit examples, due to the
possible dependence upon u (i.e., $\lambda$) of the Malliavin calculus. Let us remark that a similar
limitation is also present in [PR09] and perhaps, at least in the one-dimensional parametric
cases considered in [PR09, Section 5], one might similarly provide explicit examples of
super-efficient estimators also with respect to Sobolev risks, but the general, infinite-dimensional
parametric problem would still be open.